In this paper, we address the task of comparing and classifying sequences of 3D human shapes. The nonlinear dynamics of human motion over time and changes in surface parameterization make this task very challenging. To tackle it, we propose to embed 3D shape sequences in an infinite-dimensional space, the space of varifolds, equipped with an inner product derived from a given positive definite kernel. More specifically, our approach involves two steps: 1) surfaces are represented as varifolds, a representation that makes the metric equivariant to rigid motions and invariant to parameterization; 2) a sequence of 3D shapes is represented by the Gram matrix derived from its infinite-dimensional Hankel matrix. The problem of comparing two 3D human shape sequences is then formulated as a comparison of two Gram-Hankel matrices. Extensive experiments on the CVSSP3D and Dyna datasets show that our method is competitive with state-of-the-art approaches for 3D human sequence motion retrieval. The code for the experiments is available at https://github.com/cristal-3dsam/humancomparisonvarifolds.
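The Gram-Hankel construction in step 2 can be illustrated with a small numerical sketch. The snippet below is a simplified illustration, not the paper's implementation: it assumes each frame of a 3D shape sequence has already been reduced to a finite feature vector (the paper works with infinite-dimensional varifold representations and a kernel inner product), builds a Hankel matrix over a sliding window, and compares two sequences through the Frobenius distance between their normalized Gram matrices.

```python
import numpy as np

def gram_hankel(frames, window=3):
    """Build a Hankel matrix from a sequence of per-frame feature vectors,
    then return its normalized Gram matrix.

    frames : array of shape (T, d) -- one feature vector per frame.
    window : number of consecutive frames stacked per Hankel column.
    """
    T, d = frames.shape
    n_cols = T - window + 1
    # Column j stacks frames j, j+1, ..., j+window-1.
    H = np.column_stack(
        [frames[j:j + window].reshape(-1) for j in range(n_cols)]
    )                                 # shape (window * d, n_cols)
    G = H.T @ H                       # Gram matrix of Hankel columns
    return G / np.linalg.norm(G)      # normalize so scale does not dominate

def sequence_distance(frames_a, frames_b, window=3):
    """Compare two equal-length sequences via their Gram-Hankel matrices."""
    Ga = gram_hankel(frames_a, window)
    Gb = gram_hankel(frames_b, window)
    return np.linalg.norm(Ga - Gb)    # Frobenius distance

# Toy usage: two random "motion" sequences of 10 frames, 5 features each.
rng = np.random.default_rng(0)
seq_a, seq_b = rng.normal(size=(10, 5)), rng.normal(size=(10, 5))
print(sequence_distance(seq_a, seq_b))
```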
The dissemination of hateful memes online has adverse effects on social media platforms and the real world. Detecting hateful memes is challenging, one of the reasons being the evolutionary nature of memes; new hateful memes can emerge by fusing hateful connotations with other cultural ideas or symbols. In this paper, we propose a framework that leverages multimodal contrastive learning models, in particular OpenAI's CLIP, to identify targets of hateful content and systematically investigate the evolution of hateful memes. We find that semantic regularities exist in CLIP-generated embeddings that describe semantic relationships within the same modality (images) or across modalities (images and text). Leveraging this property, we study how hateful memes are created by combining visual elements from multiple images or fusing textual information with a hateful image. We demonstrate the capabilities of our framework for analyzing the evolution of hateful memes by focusing on antisemitic memes, particularly the Happy Merchant meme. Using our framework on a dataset extracted from 4chan, we find 3.3K variants of the Happy Merchant meme, with some linked to specific countries, persons, or organizations. We envision that our framework can be used to aid human moderators by flagging new variants of hateful memes so that moderators can manually verify them and mitigate the problem of hateful content online.
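As a rough illustration of the embedding step, the snippet below uses the Hugging Face transformers implementation of OpenAI's CLIP to embed a set of meme images and a reference image, then ranks candidates by cosine similarity. The file names are placeholders, and the variant-detection pipeline in the paper involves more than this single step.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_images(paths):
    """Return L2-normalized CLIP image embeddings for a list of file paths."""
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

# Hypothetical file names, used purely for illustration.
candidate_paths = ["meme_001.png", "meme_002.png", "meme_003.png"]
reference = embed_images(["happy_merchant_reference.png"])
candidates = embed_images(candidate_paths)

# Cosine similarity of each candidate to the reference image;
# high-similarity candidates are flagged as potential variants.
similarity = (candidates @ reference.T).squeeze(-1)
for path, score in zip(candidate_paths, similarity.tolist()):
    print(f"{path}: {score:.3f}")
```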
In recent years, large language models (LLMs) have demonstrated impressive capabilities in natural language generation. A common practice for improving generation diversity is to sample multiple outputs from the model. However, there is no simple and reliable way to select the best output among these random samples. As a case study in the context of question generation, we propose two prompt-based approaches to select high-quality questions from a set of LLM-generated candidates. Our methods operate under the constraints of 1) a black-box (non-modifiable) question generation model and 2) a lack of access to human-annotated references, both of which are realistic limitations of real-world deployments of LLMs. Through automatic and human evaluations, we empirically demonstrate that our approach effectively selects questions of higher quality than greedy generation.
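The sample-then-select loop described above can be sketched generically. The snippet below assumes a black-box text generation function llm_generate(prompt, temperature), a hypothetical stand-in for any LLM API, and uses a second, prompt-based call to the same model to score each candidate question; the exact prompts and scoring scheme in the paper differ.

```python
# Hypothetical black-box LLM call; swap in any provider's API.
def llm_generate(prompt: str, temperature: float = 1.0) -> str:
    raise NotImplementedError("plug in your LLM API here")

def sample_candidates(context: str, n: int = 8) -> list[str]:
    """Sample n candidate questions for a passage via temperature sampling."""
    prompt = f"Write one question about the following passage:\n{context}\nQuestion:"
    return [llm_generate(prompt, temperature=1.0) for _ in range(n)]

def prompt_based_score(context: str, question: str) -> float:
    """Ask the same black-box model to rate a candidate (1-5); no references needed."""
    prompt = (
        "On a scale of 1 to 5, how clear, answerable, and relevant is this "
        f"question for the passage?\nPassage: {context}\nQuestion: {question}\n"
        "Reply with a single number:"
    )
    reply = llm_generate(prompt, temperature=0.0)
    try:
        return float(reply.strip().split()[0])
    except ValueError:
        return 0.0  # an unparsable rating counts as lowest quality

def select_best_question(context: str) -> str:
    """Return the highest-scoring question among the sampled candidates."""
    candidates = sample_candidates(context)
    return max(candidates, key=lambda q: prompt_based_score(context, q))
```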
Estimating the prevalence of a medical condition, i.e., the proportion of a population that has it, is a fundamental problem in healthcare and public health. Accurately estimating relative prevalence across groups (e.g., capturing that a disease affects women more frequently than men) facilitates effective and equitable health policy that prioritizes groups disproportionately affected by a condition. However, it is difficult to estimate relative prevalence when a medical condition is underreported. In this work, we provide a method, grounded in the positive unlabeled learning framework, for accurately estimating the relative prevalence of underreported medical conditions. We show that under a commonly made covariate shift assumption, i.e., that the probability of having the disease conditional on symptoms remains constant across groups, we can recover relative prevalence even without the restrictive assumptions usually made in positive unlabeled learning, and even when absolute prevalence cannot be recovered. We provide a suite of experiments on synthetic and real health data that demonstrate our method's ability to recover relative prevalence more accurately than baselines, as well as its robustness to plausible violations of the covariate shift assumption.
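A minimal sketch of the positive-unlabeled idea behind this kind of estimator is given below. It assumes a setup in which the observed diagnosis label s is an under-reported proxy for the true condition y, so that a classifier predicting s from symptoms is (approximately) proportional to p(y=1 | x), and the unknown proportionality constant cancels when taking the ratio of group averages. This illustrates the general framework only, not the paper's exact estimator.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def relative_prevalence(X, s, group):
    """Estimate prevalence(group A) / prevalence(group B) from under-reported labels.

    X     : (n, d) symptom/covariate matrix.
    s     : (n,) observed (under-reported) diagnosis labels in {0, 1}.
    group : (n,) boolean array, True for group A, False for group B.
    """
    # p(s=1 | x) is assumed proportional to p(y=1 | x) with the same
    # (unknown) constant in both groups, so the constant cancels in the ratio.
    clf = LogisticRegression(max_iter=1000).fit(X, s)
    p_s = clf.predict_proba(X)[:, 1]
    return p_s[group].mean() / p_s[~group].mean()

# Synthetic check: group A has systematically higher symptom levels, and
# only 30% of true cases are ever recorded.
rng = np.random.default_rng(0)
n = 20_000
group = rng.random(n) < 0.5
X = rng.normal(size=(n, 3)) + group[:, None] * 0.5
p_y = 1 / (1 + np.exp(-(X @ np.array([1.0, -0.5, 0.3]) - 1.0)))
y = rng.random(n) < p_y
s = y & (rng.random(n) < 0.3)

true_ratio = p_y[group].mean() / p_y[~group].mean()
print(true_ratio, relative_prevalence(X, s.astype(int), group))
```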
Distribution shifts, where the training distribution differs from the test distribution, can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity in real-world deployments, these distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present Wilds, a curated benchmark of 10 datasets reflecting a diverse range of distribution shifts that naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time and location in satellite imaging and poverty mapping. On each dataset, we show that standard training yields substantially lower out-of-distribution than in-distribution performance. This gap remains even with models trained by existing methods for tackling distribution shifts, underscoring the need for new methods for training models that are more robust to the types of distribution shifts that arise in practice. To facilitate method development, we provide an open-source package that automates dataset loading, contains default model architectures and hyperparameters, and standardizes evaluations. Code and leaderboards are available at https://wilds.stanford.edu.
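The data-loading workflow of the released package can be sketched as follows. This follows the interface described in the WILDS documentation (get_dataset plus the standard train loader); the chosen dataset, transform, and batch size are illustrative, and argument details may vary across package versions.

```python
from torchvision import transforms

from wilds import get_dataset
from wilds.common.data_loaders import get_train_loader

# Download one benchmark dataset (camera-trap images for wildlife monitoring).
dataset = get_dataset(dataset="iwildcam", download=True)

# Standard in-distribution training split with a simple image transform.
train_data = dataset.get_subset(
    "train",
    transform=transforms.Compose(
        [transforms.Resize((224, 224)), transforms.ToTensor()]
    ),
)

# Standard (i.i.d.) data loader; group-aware loaders are also provided.
train_loader = get_train_loader("standard", train_data, batch_size=16)

for x, y_true, metadata in train_loader:
    ...  # train a model here, then evaluate on the out-of-distribution test split
```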
Algorithms are now regularly used to decide whether defendants awaiting trial are too dangerous to be released back into the community. In some cases, black defendants are substantially more likely than white defendants to be incorrectly classified as high risk. To mitigate such disparities, several techniques have recently been proposed to achieve algorithmic fairness. Here we reformulate algorithmic fairness as constrained optimization: the objective is to maximize public safety while satisfying formal fairness constraints designed to reduce racial disparities. We show that for several past definitions of fairness, the optimal algorithms that result require detaining defendants above race-specific risk thresholds. We further show that the optimal unconstrained algorithm requires applying a single, uniform threshold to all defendants. The unconstrained algorithm thus maximizes public safety while also satisfying one important understanding of equality: that all individuals are held to the same standard, irrespective of race. Because the optimal constrained and unconstrained algorithms generally differ, there is tension between improving public safety and satisfying prevailing notions of algorithmic fairness. By examining data from Broward County, Florida, we show that this trade-off can be large in practice. We focus on algorithms for pretrial release decisions, but the principles we discuss apply to other domains, and also to human decision makers carrying out structured decision rules.
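The contrast between the constrained and unconstrained optima can be illustrated with a short sketch: given estimated risk scores, the unconstrained rule detains everyone above a single threshold, while fairness-constrained rules of the kind analyzed here amount to group-specific thresholds. The scores, group labels, and cutoffs below are placeholders, not values from the Broward County data.

```python
import numpy as np

def detain_uniform(risk, threshold=0.5):
    """Unconstrained optimum: one threshold applied to every defendant."""
    return risk >= threshold

def detain_group_thresholds(risk, group, thresholds):
    """Constrained optimum: each group is held to its own threshold."""
    cutoffs = np.array([thresholds[g] for g in group])
    return risk >= cutoffs

# Toy example with placeholder risk scores and group labels.
risk = np.array([0.2, 0.45, 0.55, 0.7, 0.9])
group = np.array(["a", "a", "b", "b", "a"])

print(detain_uniform(risk, threshold=0.5))
print(detain_group_thresholds(risk, group, {"a": 0.6, "b": 0.4}))
```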